PH345: Winter 2025
Is there a moment where a galloping horse has all hooves off the ground?
Eadweard Muybridge, 1878. Public Domain
Schematic representation of late-prophase chromosomes (1000-band stage) of man, chimpanzee, gorilla, and orangutan, arranged from left to right, respectively, to better visualize homology between the chromosomes of the great apes and the human complement.
Figure 2, Yunis and Prakash (1982)
[S]lice the data into parts according to one or more data dimensions, visualize each data slice separately, and then arrange the individual visualizations into a grid (Ch21, Wilke, 2019).
Called ‘small multiples’ (Tufte, 1991), ‘trellis plots’ (Becker et al., 1996), or ‘facet plots’ (Wickham, 2016).
Introduces a third dimension (first two dimensions being x and y). Essentially another aesthetic.
Tufte’s principles of graphical excellence:
Present many numbers in small space
Make large data sets coherent
Use when you want to focus audience attention on differences
Faceting variable must be categorical
Each mini-plot should (normally) have same structure: common axes, scales, etc
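The two rules above can be sketched with a dataset that ships with ggplot2. This is only an illustration, not part of the original slides: the `mpg` data and the helper column `displ_bin` are assumptions chosen for the example. A continuous variable is first binned into categories with `cut_number()`, and the default fixed scales keep common axes across panels.

```r
library(ggplot2)

# mpg ships with ggplot2; displ (engine displacement) is continuous.
# cut_number() bins it into 3 groups with roughly equal counts, which
# gives a categorical variable suitable for faceting.
mpg_binned <- as.data.frame(mpg)
mpg_binned$displ_bin <- cut_number(mpg_binned$displ, n = 3)  # hypothetical helper column

p_binned <- ggplot(mpg_binned, aes(x = cty, y = hwy)) +
  geom_point(size = 1) +
  # The default scales = "fixed" keeps common x- and y-axes in every panel
  facet_wrap(vars(displ_bin))
p_binned
```

Because every panel shares the same axes, a point's position means the same thing in each mini-plot.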
New Zealand statistician, Chief Scientist at Posit PBC
Creator of ggplot2 and the tidyverse
John Chambers Award for Statistical Computing (2006); Fellow of ASA (2015); COPSS Presidents’ Award (2019)
ggplot2
facet_wrap(): wrap a 1D ribbon of plots into a 2D grid
facet_grid(): create a 2D grid of plots
See Chapter 16 of the ggplot2 book (https://ggplot2-book.org/facet)
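A minimal side-by-side sketch of the two functions, using the `mpg` data bundled with ggplot2. The dataset and the object names `base`, `p_wrap`, and `p_grid` are assumptions for illustration and do not come from the slides.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(size = 1)

# facet_wrap(): one faceting variable; the 7 car classes form a ribbon
# of panels that wraps onto a new row after every 4 panels
p_wrap <- base + facet_wrap(vars(class), ncol = 4)

# facet_grid(): two faceting variables; rows are drive train (drv),
# columns are number of cylinders (cyl), one panel per combination
p_grid <- base + facet_grid(rows = vars(drv), cols = vars(cyl))

p_wrap
p_grid
```

With a single faceting variable that has many levels, `facet_grid()` would place all panels in one long row or column, which is why `facet_wrap()` is usually the better fit in that case.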
facet_wrap()
# Install and load datasauRus if not already installed
if (!require(datasauRus)) {
  install.packages("datasauRus")
  library(datasauRus)
}
library(ggplot2)
# One panel per dataset in the Datasaurus Dozen, five panels per row
ggplot(datasaurus_dozen) +
  geom_point(aes(x = x, y = y), size = 1) +
  facet_wrap(facets = vars(dataset), ncol = 5) +
  guides(color = "none") +
  theme(text = element_text(size = 20))
facet_grid() doesn’t ‘work’ here
Spread by the bite of female Anopheles mosquitoes, which are hosts to the malaria parasite
Exists in tropical and subtropical regions with inadequate public health infrastructure
263 million cases in 2023; 597,000 deaths
US CDC: https://www.cdc.gov/malaria/data-research/index.html
World Malaria Report, WHO: https://www.who.int/teams/global-malaria-programme/reports/world-malaria-report-2024
WHO collects country-level data on malaria incidence.
Raw data on incidence per 1,000 at-risk persons available at https://data.worldbank.org/indicator/SH.MLR.INCD.P3
Friendlier data available on Canvas (malaria_countries_long.csv)
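A hedged sketch of how the long-format file might be loaded and faceted by country. It assumes the file sits in the working directory and has columns named `country`, `year`, and `incidence` (the names used in the code on later slides). The fallback data frame holds made-up placeholder numbers, not real incidence figures, so that the sketch runs without the course file.

```r
library(ggplot2)

path <- "malaria_countries_long.csv"  # assumed to sit in the working directory

if (file.exists(path)) {
  malaria <- read.csv(path)
} else {
  # Made-up placeholder values so the sketch runs without the course file;
  # these are NOT real incidence figures
  malaria <- data.frame(
    country   = rep(c("Country A", "Country B", "Country C"), each = 3),
    year      = rep(c(2000, 2010, 2020), times = 3),
    incidence = c(400, 300, 200, 150, 160, 170, 50, 25, 10)
  )
}

# One panel per country, with common axes so levels are comparable
p_malaria <- ggplot(malaria, aes(x = year, y = incidence)) +
  geom_line() +
  facet_wrap(vars(country))
p_malaria
```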
# Update the p2 object from the previous slide with the provided hints
p3 <-
  p2 +
  # The '.' stands for the data inherited from the plot object (p2),
  # which we then modify further: keep one row per country for the label
  geom_text(data = . %>% group_by(country) %>% slice(1),
            aes(label = country), x = 2000, y = 650, hjust = 0, size = 3) +
  theme(strip.text = element_blank())
p3
# Create a new variable called delta_incidence that is the difference
# between the last and first incidence values for each country.
# Then arrange the data by this variable and turn country into a factor
# with levels in the order that they appear in the data:
mutate(country = factor(country) %>% fct_inorder())
malaria_countries_ssafrica2 <-
malaria_countries_ssafrica %>%
group_by(country) %>%
# Note we are using mutate, not summarize, since we want to keep the year-level
# data. Because the data are grouped by country, the result will just
# be constant within country (this assumes rows are sorted by year within country)
mutate(delta_incidence = last(incidence) - first(incidence)) %>%
ungroup() %>%
# Now sort by the calculated value, then country name, then year
arrange(delta_incidence, country, year) %>%
# Now we recast country as a factor with levels in the order in which
# they are arranged
mutate(country = factor(country) %>% fct_inorder())
# Update the p3 object from the previous slide with new data
# Note use of new operator `%+%` for updating data. Type ?'%+%' for more info
p4 <- p3 %+% malaria_countries_ssafrica2
p4
Think about the trade-offs of varying y-axes: plots with varying y-axes are essentially many individual plots. They make it easier to evaluate differences within a panel but harder to compare between panels
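The trade-off can be seen in a small sketch that uses `economics_long`, a dataset bundled with ggplot2. The dataset choice and the object names are assumptions for illustration, not the slide's malaria example. The `scales` argument of `facet_wrap()` switches between common and panel-specific y-axes.

```r
library(ggplot2)

# economics_long ships with ggplot2: five US economic series whose
# values sit on very different scales
base <- ggplot(economics_long, aes(x = date, y = value)) +
  geom_line()

# Common y-axis (the default): easy to compare levels between panels,
# but series with small values are flattened
p_fixed <- base + facet_wrap(vars(variable), ncol = 1)

# Varying y-axes: each panel gets its own y-range, so the shape within
# a panel is clear, but heights are not comparable between panels
p_free <- base + facet_wrap(vars(variable), ncol = 1, scales = "free_y")

p_fixed
p_free
```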
Wrapped facet by US States, ordered by size of workforce. Easy to compare trends across states. Unemployment band provides anchors
Anna Maria Barry-Jester, https://fivethirtyeight.com/features/as-u-s-life-expectancies-climb-people-in-a-few-places-are-dying-younger/
Sort of a facet grid by categorized latitude and longitude. The trend line provides an anchor. The amount of purple and orange allow to immediately determine
No Spice: Make the standard faceted plot on slide 15
Weak Sauce: Make the faceted plot without strips on slide 16; make the faceted plot with varying y-axes on slide 20
Medium Spice: Make the faceted plot with inside labels on slide 17
Yoga Flame: Make the faceted plot with reordered facets on slide 18
Dim Mak: Make the faceted plot with the Sub-Saharan Africa trajectory on slide 19
Becker, R.A., Cleveland, W.S. and Shyu, M.J., 1996. The visual design and control of trellis display. Journal of Computational and Graphical Statistics, 5(2), pp.123-155.
Tufte, E.R., 1991. Envisioning information. Optometry and Vision Science, 68(4), pp.322-324.
Wickham, H., 2016. ggplot2: Elegant graphics for data analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Wilke, C.O., 2019. Fundamentals of data visualization: a primer on making informative and compelling figures. O’Reilly Media.
Yunis, J.J. and Prakash, O., 1982. The origin of man: a chromosomal pictorial legacy. Science, 215(4539), pp.1525-1530.